Scientific Annals of Computer Science vol. 20, 2010 


“Alexandru Ioan Cuza” University of Iasi, Romania 


State Space Reduction for 
Dynamic Process Creation 


Hanna KLAUDEL, Maciej KOUTNY,
Elisabeth PELZ, Franck POMMEREAU


Abstract 


Automated verification of dynamic multi-threaded computing sys- 
tems is severely affected by problems relating to dynamic process cre- 
ation. In this paper, we describe an abstraction technique aimed at 
generating reduced state space representations for such systems. To 
make the new technique applicable to a wide range of different system 
models, we express it in terms of general labelled transition systems. 

At the heart of our technique is an equivalence relation on system 
states based on a suitable isomorphism between their component parts 
and relationships between component process identifiers. In addition, 
the equivalence takes into account new process identifiers which can 
be derived from those present in the states being compared, in effect 
performing a limited lookahead. 

Applying state space reduction based on such a state equivalence 
may produce a finite representation of an infinite state system while 
still allowing one to validate essential behavioural properties, e.g., freedom
from deadlocks. We evaluate the feasibility of the proposed method 
through extensive experiments. The results clearly demonstrate that 
the new state space reduction technique can be implemented in an 
efficient way. 

We also describe how the new state equivalence relation can be 
implemented for a class of high-level Petri nets supporting dynamic 
thread creation. 
1 Introduction


In a multi-threaded programming paradigm, sequential code can be run 
repeatedly in concurrent threads of execution, interacting through shared 
data and/or rendez-vous communication. The presence of thread identifiers 
in state descriptions usually accelerates the state space explosion, especially 
when new threads may be created dynamically. However, thread identifiers 
are arbitrary (anonymous) symbols whose sole role is to ensure a consistent 
execution of each thread. The exact identity of an identifier is irrelevant, 
and all that matters is the relationships between such identifiers, e.g.,
parenthood or siblinghood. As a result, (sets of) identifiers may often be 
replaced by other (sets of) identifiers without changing the resulting execu- 
tion in an essential way. This creates a possibility of identifying equivalent 
executions, which must be addressed by any verification and/or simulation 
approach to multi-threaded programming schemes. 

We aim at an abstraction technique for generating reduced state space 
representations for multi-threaded systems with dynamic process creation. 
To make it applicable to a wide range of different system models, we con- 
sider in this paper a general model-independent (behavioural) framework of 
labelled transition systems. We formulate conditions allowing one to iden- 
tify behaviourally equivalent system states and, since state equivalence is 
required to be preserved over system evolutions, to identify equivalent execu- 
tions. The equivalence relation on system states is based on an isomorphism 
between their component parts and relationships between the component 
process identifiers. Moreover, one takes into account new process identifiers 
which can be derived from those present in the states being compared, in 
effect performing a limited lookahead. Applying state space reduction based
on such a state equivalence may, in some cases, even produce a finite representation
of an infinite state system while still allowing one to validate essential
behavioural properties, e.g., freedom from deadlocks.

Another distinguishing feature of the proposed state equivalence is that 
it is parameterised by a set of operations that can be applied to thread 
identifiers. For instance, it may or may not be allowed to test whether one 
thread is a direct or indirect descendant of another thread. The approach 
easily adapts to what is usually allowed and this can be a crucial point to 
maximise the degree of state space reduction. As shown later on, using fewer 
operations usually leads to better reduction, because process identifiers are 
more likely to be equivalent if there are fewer possibilities to compare them. 

We evaluate the feasibility of the proposed method through extensive 
experiments in order to show that the new state space reduction technique
can be implemented in an efficient way. We also present a concrete system 
model based on high-level Petri nets, in which the key property required of 
the state equivalence relation — its preservation over system evolution — 
holds thanks to mild syntactic restrictions placed upon the structure of the 
net. In this way, the paper prepares the ground for future applications of 
the new state space reduction technique. 

The approach presented in this paper systematises and extends our ear- 
lier work initiated in [11, 12]: in particular, the Petri net implementation of 
our approach has been dramatically simplified, all the proofs are now pro- 
vided, and we now consider an additional isomorphism algorithm and draw 
conclusions based on a rigorous analysis of our experimental measurements. 


Running example. Let us consider a server system in which 
a bunch of threads listen for connections from clients requesting 
some calculation. Figure 1 shows a message sequence chart of a 
typical session. Whenever a new request arrives, a listener thread 
creates a handler thread to process the request. The handler calls 
an auxiliary function to perform the required calculation and 
then sends the answer back to the client. Terminated handlers 
are collected asynchronously by the thread that created them. 
The client part is depicted for the sake of clarity but will not be 
considered in the model to keep things simpler. 


This example illustrates two standard ways of calling a sub- 
program: either asynchronously by creating a thread, or syn- 
chronously by calling a function. In our setting, both these 
methods amount to creating a new thread, the only difference 
is that a function call is modelled by creating a thread and im- 
mediately waiting for its termination. 


The paper is organised as follows. We first characterise the class of 
labelled transition systems for which it is feasible to apply identification of 
states inspired by the marking equivalence of [11]. We then provide experimental
results on the effort needed to verify the state equivalence,
which reaffirm our initial hypothesis that this can be done in an efficient
way. Finally, we show how the state equivalence can be treated in a class
of high-level Petri nets supporting dynamic thread creation. More precisely, 
for each net in this class the generated labelled transition system satisfies 
all the requirements formulated in the general setting. 
[Figure 1 (diagram not reproduced): message sequence chart with four lifelines (Client, Listener, Handler, Function) and the messages submit(pid,req), create(pid,addr,req), call(pid,req), answer(pid,res), ret(pid,res) and wait(pid).]


Figure 1: Running example: sequence diagram of a session. 


2 Multi-thread modelling 


We denote by $\mathbb{D}$ the set of data values, and by $\mathbb{P} \stackrel{\mathrm{df}}{=} \{\pi, \pi', \pi'', \ldots\}$ a disjoint
set of process identifiers (or pids for short) that allow one to distinguish
different concurrent threads during an execution. We assume that there is a
set $\mathbb{I} \subset \mathbb{P}$ of initial pids, i.e., threads active at the system startup. A possible
way of implementing dynamic pid creation, and the one adopted in this paper,
is to consider pids as finite sequences of positive integers (written down as
dot-separated strings, for example, 1.2 or 2.3.1). In modelled systems, it is
not allowed to decompose a pid (e.g., to extract the parent pid of a given pid),
which is considered to be an atomic value (a black box), nor is it allowed to
use concrete pid values (literals).

To compare pids within Boolean expressions, we use a set of binary
relations on pids, $\Omega_{pid} \stackrel{\mathrm{df}}{=} \{=, \triangleleft_1, \triangleleft, \curvearrowright_1, \curvearrowright\}$, such that:


• $=$ is the equality on pids;


• $\pi \triangleleft_1 \pi'$ (which holds if there is a positive integer $i$ such that $\pi.i = \pi'$)
means that $\pi$ is the parent of $\pi'$ (i.e., thread $\pi$ created thread $\pi'$);


• $\pi \triangleleft \pi'$ means that $\pi$ is an ancestor of $\pi'$ (i.e., $\triangleleft$ is the transitive closure
$\triangleleft_1^+$ of $\triangleleft_1$);


• $\pi \curvearrowright_1 \pi'$ (which holds if there is a pid $\pi''$ and a positive integer $i$ such
that $\pi = \pi''.i$ and $\pi' = \pi''.(i+1)$) means that $\pi$ is a sibling of $\pi'$ and
$\pi$ was created immediately before $\pi'$ (i.e., after creating $\pi$, the parent
$\pi''$ of $\pi$ and $\pi'$ did not create any other thread before creating $\pi'$); and


• $\pi \curvearrowright \pi'$ means that $\pi$ is an elder sibling of $\pi'$ (i.e., $\curvearrowright$ is the transitive
closure $\curvearrowright_1^+$ of $\curvearrowright_1$).

It is assumed that only the operators in $\Omega_{pid}$ can be used to compare pids.
The above scheme has several advantages: it is deterministic and can 
be distributed to support concurrent generation of pids; it is simple and easy 
to implement; and it may be bounded by restricting, e.g., the length of the 
pids, or the maximal number of pid instances spawned by each thread. 
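
The relations of $\Omega_{pid}$ can be illustrated by a minimal Python sketch. It assumes that a pid is represented as a tuple of positive integers (so 2.3.1 becomes `(2, 3, 1)`); the function names are hypothetical and not taken from the paper. Top-level pids such as 1 and 2 are treated as siblings, which matches the running example used later in the text.

```python
from typing import Tuple

Pid = Tuple[int, ...]  # assumption: the pid 1.2 is represented as (1, 2)


def parse_pid(text: str) -> Pid:
    """Turn a dot-separated string such as '2.3.1' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_parent(a: Pid, b: Pid) -> bool:
    """Parenthood: b is a followed by exactly one more positive integer."""
    return len(b) == len(a) + 1 and b[:-1] == a


def is_ancestor(a: Pid, b: Pid) -> bool:
    """Ancestry (transitive closure of parenthood): a is a proper prefix of b."""
    return len(a) < len(b) and b[:len(a)] == a


def is_next_sibling(a: Pid, b: Pid) -> bool:
    """Immediate siblinghood: same parent, b created right after a."""
    return len(a) == len(b) >= 1 and a[:-1] == b[:-1] and b[-1] == a[-1] + 1


def is_elder_sibling(a: Pid, b: Pid) -> bool:
    """Elder siblinghood (transitive closure of immediate siblinghood)."""
    return len(a) == len(b) >= 1 and a[:-1] == b[:-1] and a[-1] < b[-1]
```

Since the modelled systems may not decompose pids, such functions would belong to the analysis tool rather than to the model itself.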


3 Transition systems and state equivalence 


In this paper, a labelled transition system LTS provides a description of
all reachable states of some multi-threaded system, operating over some
finite set of locations $\mathbb{L}$, together with transitions between these states. As
usual, there is a distinguished initial state from which all other states can be
reached. Locations may correspond intuitively to the variables of the system
under analysis, or to any other similar “data holder”. Each state $q = (\sigma_q, \eta_q)$
is composed of a pair of mappings:


• a mapping $\sigma_q$ from $\mathbb{L}$ to finite multisets of vectors (tuples) of data and
pids;


• a mapping $\eta_q$ which for each active thread (i.e., one which belongs to
the domain of $\eta_q$) gives the number of threads it has already created.


Given a state $q$ as above, we define the following:

• For each active pid $\pi$ in $q$, the next pid created (also called next-pid)
is given by $next_q(\pi) = \pi.(\eta_q(\pi) + 1)$;

• The next-pids of $q$ are $nextpid_q = \{next_q(\pi) \mid \pi \in dom(\eta_q)\}$;

• $pid_q$ is the set of all pids involved in $\sigma_q$; and

• $(G_q, H_q) \stackrel{\mathrm{df}}{=} (pid_q, \eta_q)$ is the thread configuration of $q$.


Each transition $q \xrightarrow{t} q'$ of an LTS is labelled by $t$ which can be the
name of a command or action (possibly internal) together with the actual
parameters which may include pids present in $pid_q$.


Assumption 1 When moving from a state $q$ to $q'$ along a transition $q \xrightarrow{t} q'$,
the pids present in $q'$ must either be present in $q$ or be newly created using
the information provided by $\eta_q$ (i.e., $pid_{q'} \subseteq pid_q \cup nextpid_q$). Moreover,
any pid active at $q'$ must be active in $q$ or be a newly created one (i.e.,
$dom(\eta_{q'}) \subseteq dom(\eta_q) \cup nextpid_q$).
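
As an illustration of these definitions, the following sketch computes next-pids and checks the two inclusions of Assumption 1 for a pair of states. The encoding is an assumption made for the example only: a state is a pair `(sigma, eta)`, `sigma` maps location names to lists of vectors, pids are tuples of integers, and any non-tuple component is treated as data.

```python
def is_pid(component) -> bool:
    # Assumption of this sketch: pids are tuples, data are ints or strings.
    return isinstance(component, tuple)


def next_pid(eta, pid):
    """Next-pid of an active pid: pid.(eta(pid) + 1)."""
    return pid + (eta[pid] + 1,)


def next_pids(eta):
    """All next-pids of a state, one per active pid."""
    return {next_pid(eta, pid) for pid in eta}


def pids_of(sigma):
    """All pids occurring in the vectors stored at the locations."""
    return {c for vectors in sigma.values() for vec in vectors
            for c in vec if is_pid(c)}


def respects_assumption_1(q, q_next) -> bool:
    """Check both inclusions of Assumption 1 for a step from q to q_next."""
    (sigma, eta), (sigma2, eta2) = q, q_next
    fresh = next_pids(eta)
    return (pids_of(sigma2) <= pids_of(sigma) | fresh
            and set(eta2) <= set(eta) | fresh)


# Two states in the style of the running example: thread 1 creates thread 1.1.
q0 = ({"L": [((1,),), ((2,),)], "H": [], "F": []},
      {(1,): 0, (2,): 0})
q1 = ({"L": [((1,),), ((2,),)], "H": [((1, 1), "addr", 2009)], "F": []},
      {(1,): 1, (2,): 0, (1, 1): 0})
```

Here `respects_assumption_1(q0, q1)` holds because the only new pid, 1.1, is the next-pid of the active thread 1 in `q0`.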
Not all potential states can be considered as valid. For example, one
should prohibit the generation of an already existing pid. This can be
achieved by requiring that no pid involved in $\sigma_q$ can be derived as a future
child of an active thread.


Assumption 2 Each state $q$ reachable from the initial one generates a consistent
thread configuration (or ct-configuration) $(G_q, H_q)$ which means that:


• $dom(H_q) \subseteq G_q$, i.e., each active pid is present in the state; and


• for all $\pi \in dom(H_q)$ and $\pi' \in G_q$, if $\pi.k$ is a prefix of $\pi'$ then $k \le \eta_q(\pi)$,
i.e., pids present in $q$ cannot be created again.
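
The two conditions of Assumption 2 translate into a short check. The sketch below assumes the same hypothetical encoding as before (pids as integer tuples, `G` a set of pids, `H` a dictionary from active pids to counters); it is meant as an illustration, not as the authors' implementation.

```python
def is_consistent(G, H) -> bool:
    """Check that (G, H) is a consistent thread configuration."""
    # First condition: every active pid is present in the state.
    if not set(H) <= set(G):
        return False
    # Second condition: if pid.k is a prefix of some present pid,
    # then k must not exceed the number of children pid has created.
    for pid, created in H.items():
        for other in G:
            if len(other) > len(pid) and other[:len(pid)] == pid:
                k = other[len(pid)]
                if k > created:
                    return False
    return True
```

For instance, with `G = {(1,), (1, 1)}` the counter of pid 1 must be at least 1, since otherwise 1.1 could be generated a second time.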


In the rest of the section we will introduce the notion of equivalence for 
reachable states of an LTS. First, however, we look at our example. 


Running example. An initial fragment of the LTS for the
server system is illustrated in Figure 2, where:


• $\mathbb{I} = \{1, 2\}$;

• $\mathbb{L} = \{L, H, F\}$ are the locations;

• 1, 2, 1.1, 1.2, 2.1, 2.2, 1.1.1 and 2.1.1 are pids; and

• 2009, 0 and addr are data items.


It portrays two alternative execution branches corresponding to
the same scenario played by two different threads. In this scenario,
a server calculates whether a given year is a leap year.
A listener receives a data item req = 2009, creates a handler, and
passes to it req together with the client's address addr. The handler
calls a function passing req to it, the function calculates the result
res = 0, and returns it to the handler. The handler passes
on the result to the client (not modelled). Finally, the listener
terminates the handler.


Looking at Figure 2, one can note that the two branches,

$q_0 \xrightarrow{t_1} q_1 \xrightarrow{t_2} \cdots \xrightarrow{t_5} q_5$ and $q_0 \xrightarrow{t'_1} q'_1 \xrightarrow{t'_2} \cdots \xrightarrow{t'_5} q'_5$,

are intuitively equivalent since the roles of pids 1 and 2 can be
swapped to obtain one from the other. Moreover, $q_0$ is equivalent
to $q_5$ and $q'_5$. For example, $q_5$ is the same as $q_0$ except for the
value of $\eta(1)$.
[Figure 2 (diagram not reproduced): starting from $q_0$, each of the two branches performs, in order, the actions create, call, comp, ret and wait; the left branch ($q_1$ to $q_5$) involves pids 1, 1.1 and 1.1.1, the right branch ($q'_1$ to $q'_5$) pids 2, 2.1 and 2.1.1.]

Figure 2: A possible LTS for the running example: $\sigma_q$ is shown in the left
and $\eta_q$ in the right part of each state.


We can now proceed with the formal definition of state equivalence. It
should be stressed that our formalisation is more elaborate than
equivalence relations such as that defined in abstract terms in [8]. The reason is
that our definition not only looks at the actual components of two states being
compared, but also ‘looks ahead’ by including potential future component
pids which can be created from the existing ones. Clearly, such a lookahead
strategy needs to be carefully designed in order not to make the whole
approach infeasible.
Definition 1 Two states $q$ and $q'$ defining ct-configurations are equivalent
if there is a bijection $h : (pid_q \cup nextpid_q) \to (pid_{q'} \cup nextpid_{q'})$ such that
for all relations $\prec \in \{\triangleleft_1, \triangleleft\}$ and $\curlywedge \in \{\curvearrowright_1, \curvearrowright\}$:


• $h(dom(\eta_q)) = dom(\eta_{q'})$;
• $\forall \pi \in dom(\eta_q)$, $h(next_q(\pi)) = next_{q'}(h(\pi))$;
• $\forall \pi, \pi' \in pid_q$: $\pi \prec \pi'$ iff $h(\pi) \prec h(\pi')$;
• $\forall \pi, \pi' \in pid_q \cup nextpid_q$: $\pi \curlywedge \pi'$ iff $h(\pi) \curlywedge h(\pi')$;
• $\sigma_{q'}$ is $\sigma_q$ after replacing each pid $\pi$ by $h(\pi)$.

We denote this by $q \sim_h q'$ (or $q \sim q'$).
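
For very small states, Definition 1 can be checked directly by enumerating candidate bijections. The sketch below does exactly that; it is a brute-force illustration under assumed encodings (pids as integer tuples, a state as a pair `(sigma, eta)`, hypothetical function names), not the graph-based method described in Section 4. The flag `use_siblings` is an addition of this sketch that drops the siblinghood conditions, anticipating the restriction to the relations actually used by a model.

```python
from collections import Counter
from itertools import permutations


def is_pid(c):
    return isinstance(c, tuple)  # assumption: data are ints or strings


def parent(a, b):
    return len(b) == len(a) + 1 and b[:-1] == a


def ancestor(a, b):
    return len(a) < len(b) and b[:len(a)] == a


def next_sibling(a, b):
    return len(a) == len(b) and a[:-1] == b[:-1] and b[-1] == a[-1] + 1


def elder_sibling(a, b):
    return len(a) == len(b) and a[:-1] == b[:-1] and a[-1] < b[-1]


def pids_of(sigma):
    return {c for vs in sigma.values() for v in vs for c in v if is_pid(c)}


def nxt(eta, pid):
    return pid + (eta[pid] + 1,)


def renamed(sigma, h):
    """sigma with every pid replaced by its image, as multisets per location."""
    return {loc: Counter(tuple(h[c] if is_pid(c) else c for c in v) for v in vs)
            for loc, vs in sigma.items() if vs}


def find_equivalence(q1, q2, use_siblings=True):
    """Return a bijection h witnessing Definition 1, or None.

    Both states are assumed to be ct-configurations, so that every
    active pid also occurs in sigma.
    """
    (s1, e1), (s2, e2) = q1, q2
    present = sorted(pids_of(s1))
    dom1 = sorted(set(present) | {nxt(e1, p) for p in e1})
    dom2 = sorted(pids_of(s2) | {nxt(e2, p) for p in e2})
    if len(dom1) != len(dom2):
        return None
    target = {loc: Counter(vs) for loc, vs in s2.items() if vs}
    for image in permutations(dom2):
        h = dict(zip(dom1, image))
        if {h[p] for p in e1} != set(e2):
            continue  # active pids must map onto active pids
        if any(h[nxt(e1, p)] != nxt(e2, h[p]) for p in e1):
            continue  # next-pids must be preserved
        if any(r(a, b) != r(h[a], h[b]) for r in (parent, ancestor)
               for a in present for b in present):
            continue
        if use_siblings and any(
                r(a, b) != r(h[a], h[b]) for r in (next_sibling, elder_sibling)
                for a in dom1 for b in dom1):
            continue
        if renamed(s1, h) == target:
            return h
    return None


# States in the style of q3 and q'3 from the running example.
q3 = ({"L": [((1,),), ((2,),)], "H": [((1, 1), "addr", 2009)],
       "F": [((1, 1, 1), 2009)]},
      {(1,): 1, (2,): 0, (1, 1): 1, (1, 1, 1): 0})
q3p = ({"L": [((1,),), ((2,),)], "H": [((2, 1), "addr", 2009)],
        "F": [((2, 1, 1), 2009)]},
       {(1,): 0, (2,): 1, (2, 1): 1, (2, 1, 1): 0})
```

The enumeration is factorial in the number of pids, so this only serves to make the five conditions concrete on toy states.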


The lookahead feature in the above definition stems from the fact that
the mapping $h$ used to relate the two states operates on the pids in
$pid_q \cup nextpid_q$ and $pid_{q'} \cup nextpid_{q'}$, and so takes into account pids which are
not present in the states being compared. In contrast, a similar alternative
mapping $h_{alt}$ in the approach of, e.g., [8] would be of the form
$h_{alt} : pid_q \to pid_{q'}$. The argument in support of our approach is strong because
no such mapping $h_{alt}$ can in general provide a satisfactory solution. To show
that we do need the next-pids (i.e., $nextpid_q$ and $nextpid_{q'}$) in Definition 1,
we provide a simple example.


Counterexample. Let $q$ and $q'$ be states such that
$pid_q = \{1, 1.1\}$, $\eta_q(1) = 1$, $pid_{q'} = \{5, 5.1\}$ and $\eta_{q'}(5) = 3$. Then consider
a bijection $h_{alt} : pid_q \to pid_{q'}$ such that $h_{alt}(1) = 5$ and
$h_{alt}(1.1) = 5.1$. Everything is fine as far as preserving the
parenthood/siblinghood of the corresponding pids is concerned. Let
us now imagine that the two corresponding active threads, 1 and
5, created new pids, leading respectively to new states $r$ and $r'$
such that $pid_r = \{1, 1.1, 1.2\}$ and $pid_{r'} = \{5, 5.1, 5.4\}$. Then
$1.1 \curvearrowright_1 1.2$ yet $5.1 \not\curvearrowright_1 5.4$. This violates the preservation of the immediate
siblinghood relation of the corresponding pids, and so the two
new states are not equivalent. By including next-pids in the definition
of our mapping $h$, we avoid the problem as $q$ and $q'$ are
no longer equivalent states.
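
The counterexample can be replayed numerically. In the sketch below (pids as integer tuples, hypothetical helper names), the next-pids of threads 1 and 5 are computed from their counters, and the immediate-siblinghood relation is compared before and after applying the only mapping that respects the next-pid condition of Definition 1.

```python
def next_sibling(a, b):
    """Immediate siblinghood on pids encoded as integer tuples."""
    return len(a) == len(b) and a[:-1] == b[:-1] and b[-1] == a[-1] + 1


def next_pid(eta, pid):
    return pid + (eta[pid] + 1,)


eta_q = {(1,): 1}        # thread 1 has already created one child (1.1)
eta_q_prime = {(5,): 3}  # thread 5 has already created three children

new_in_q = next_pid(eta_q, (1,))              # the pid 1.2
new_in_q_prime = next_pid(eta_q_prime, (5,))  # the pid 5.4

# Definition 1 forces h to send next-pid to next-pid.
h = {(1,): (5,), (1, 1): (5, 1), new_in_q: new_in_q_prime}

sibling_before = next_sibling((1, 1), new_in_q)
sibling_after = next_sibling(h[(1, 1)], h[new_in_q])
lookahead_accepts = sibling_before == sibling_after
```

Here `lookahead_accepts` evaluates to `False`: the mismatch is detected already at $q$ and $q'$, before any new thread is created.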
To be of use for state space reduction, the equivalence relation introduced
in Definition 1 should be preserved through possible executions.


Assumption 3 If $q \sim_h q'$ and $q \xrightarrow{t} r$ then $q' \xrightarrow{h(t)} r'$ and $r \sim_{h'} r'$, where
$h$ and $h'$ coincide on the intersection of their domains. And similarly for
$q' \xrightarrow{t} r'$.
Note: $h(t)$ denotes $t$ with each occurrence of $\pi \in pid_q$ replaced by $h(\pi)$.


The above assumption (which needs to be demonstrated for each con- 
crete system model such as that described in Section 6) means that ~ be- 
haves like a strong bisimulation relation and can therefore be regarded as 
sufficient, e.g., for the purpose of state reduction for deadlock detection. 
Moreover, the fact that the bijection h is preserved on the retained pids over 
the corresponding transitions means that it is also sufficient if one deals 
with properties of individual (abstracted) threads over sequences of states, 
making it compatible with the unfolding based verification technique [10]. 


4 Checking state equivalence 


Checking state equivalence proceeds in two phases. First, candidate states 
are mapped to layered labelled directed graphs (or LGs), and then the LGs 
are checked for graph isomorphism. 

LGs are constructed as follows. The first layer is labelled by locations,
the second layer by (abstracted) vectors, the third layer by (abstracted)
active pids and the fourth layer by (abstracted) next-pids. The arcs are
of two sorts: those going from the container object towards the contained
object (locations contain vectors which contain pids), and those between the
vertices of the third and fourth layers reflecting the relationship between the
corresponding pids through the comparisons in $\Omega'_{pid} = \{\lessdot_1, \ldots, \lessdot_k\}$, where
$\Omega'_{pid}$ is $\Omega_{pid}$ without the equality relation and without any relation that is not
needed in the model (i.e., the concrete system model generating an LTS does
not use such a relation). Figures 3 and 4 show examples with $\Omega'_{pid} = \{\triangleleft_1\}$.
The abstraction mapping $\lfloor\cdot\rfloor : \mathbb{D} \cup \mathbb{P} \to \mathbb{D} \cup \{\varepsilon\}$ is defined as the identity on $\mathbb{D}$
and as a constant mapping $\varepsilon$ on $\mathbb{P}$, extended component-wise to vectors.

Let $q = (\sigma_q, \eta_q)$ be a state of an LTS. Then the corresponding labelled
graph representation


$LG(q) = (V; A, A_{\lessdot_1}, \ldots, A_{\lessdot_k}; \lambda)$,

where $V$ is the set of vertices (which is composed of location names, vectors
and pids), $A, A_{\lessdot_1}, \ldots, A_{\lessdot_k}$ are sets of arcs and $\lambda$ is a labelling on vertices
and arcs, is defined as follows:


1. First layer: for each location $\ell \in \mathbb{L}$ such that $\sigma_q(\ell) \neq \emptyset$, $\ell$ is a vertex
in $V$ labelled by itself, i.e., $\ell$.


2. Second layer: for each location $\ell \in \mathbb{L}$ and for each vector $v \in \sigma_q(\ell)$, $v$
is a vertex in $V$ labelled by $\lfloor v \rfloor$ and $\ell \to v$ is an unlabelled arc in $A$.


3. Third layer: for each vertex $v$ of the second layer and for each pid $\pi$ in
$v$ at the position $n$ (in the vector), $\pi$ is an $\varepsilon$-labelled vertex in $V$ and
there is an arc $v \xrightarrow{n} \pi$ in $A$.


4. Fourth layer: for each active pid $\pi$, its potential next child, $next_q(\pi)$,
is a vertex in $V$ labelled by $\varepsilon$.


For all vertices $\pi$, $\pi'$ of the third and fourth layers, and for all $1 \le j \le k$,
there is an arc $\pi \xrightarrow{\lessdot_j} \pi'$ in $A_{\lessdot_j}$ iff $\pi \lessdot_j \pi'$, i.e., $A_{\lessdot_j}$ defines the graph of the
relation $\lessdot_j$ on $V \cap \mathbb{P}$. There is no other vertex nor arc in $LG(q)$.
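
The four-layer construction can be sketched with plain Python dictionaries and sets. The vertex identifiers (`("loc", ...)`, `("vec", ...)`, `("pid", ...)`), the zero-based position labels and the use of the string `"ε"` as the abstract pid label are choices made for this illustration only; the resulting structure would then be handed to a graph isomorphism library.

```python
EPS = "ε"  # abstract label standing for any pid


def is_pid(c):
    return isinstance(c, tuple)  # assumption: data are ints or strings


def parent(a, b):
    return len(b) == len(a) + 1 and b[:-1] == a


def build_lg(sigma, eta, relations):
    """Return (labels, arcs) of the layered graph of the state (sigma, eta).

    labels maps each vertex to its label; arcs is a set of triples
    (source, target, arc_label), where arc_label is None for location-to-vector
    arcs, a position for vector-to-pid arcs, and a relation name otherwise.
    """
    labels, arcs = {}, set()
    for loc, vectors in sigma.items():
        if not vectors:
            continue                      # layer 1: non-empty locations only
        labels[("loc", loc)] = loc
        for idx, vec in enumerate(vectors):
            v = ("vec", loc, idx)         # layer 2: one vertex per occurrence
            labels[v] = tuple(EPS if is_pid(c) else c for c in vec)
            arcs.add((("loc", loc), v, None))
            for pos, c in enumerate(vec):
                if is_pid(c):             # layer 3: pids occurring in vectors
                    labels[("pid", c)] = EPS
                    arcs.add((v, ("pid", c), pos))
    for pid, created in eta.items():      # layer 4: next-pids of active pids
        labels[("pid", pid + (created + 1,))] = EPS
    pid_vertices = [v for v in labels if v[0] == "pid"]
    for name, rel in relations.items():   # relation arcs between pid vertices
        for a in pid_vertices:
            for b in pid_vertices:
                if rel(a[1], b[1]):
                    arcs.add((a, b, name))
    return labels, arcs


# A state in the style of q3 from the running example, with parenthood only.
sigma_q3 = {"L": [((1,),), ((2,),)], "H": [((1, 1), "addr", 2009)],
            "F": [((1, 1, 1), 2009)]}
eta_q3 = {(1,): 1, (2,): 0, (1, 1): 1, (1, 1, 1): 0}
labels_q3, arcs_q3 = build_lg(sigma_q3, eta_q3, {"parent": parent})
```

Keeping pid identities only in the vertex identifiers, and never in the labels, is what lets an isomorphism test ignore the concrete pid values.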

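In diagrams, we do not show arcs for relations that can be deduced
from the depicted ones. For instance, if there exists a path $x \xrightarrow{\triangleleft_1} y \xrightarrow{\triangleleft_1} z$,
then we do not depict the arc $x \xrightarrow{\triangleleft} z$ nor $x \xrightarrow{\triangleleft} y$ nor $y \xrightarrow{\triangleleft} z$.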


Theorem 1 Let $q_1$ and $q_2$ be two reachable states. Then $LG(q_1)$ and $LG(q_2)$
are isomorphic iff $q_1 \sim q_2$.


Proof: (sketch) Let $LG_i = LG(q_i) = (V_i; A_i, A^i_{\lessdot_1}, \ldots, A^i_{\lessdot_k}; \lambda_i)$ for $i \in \{1, 2\}$.

($\Rightarrow$) Let $h : V_1 \to V_2$ be an isomorphism between $LG_1$ and $LG_2$, such
that $\forall v \in V_1$, $\lambda_1(v) = \lambda_2(h(v))$ and $\forall u, v \in V_1$, $\lambda_1((u, v)) = \lambda_2((h(u), h(v)))$.
The states $q_i = (\sigma_{q_i}, \eta_{q_i})$ can easily be obtained from $LG_i$: (i) vertices of the
second layer represent the vectors associated to locations which are vertices
of the first layer; this allows one to obtain $\sigma_{q_i}$; and (ii) vertices of the fourth
layer allow one to retrieve $\eta_{q_i}$, i.e., for each such vertex $\pi.j \in V_i$, there is an
active thread $\pi$ in $q_i$ and $\eta_{q_i}(\pi) = j - 1$. Moreover, all pids (present as
vertices of the third and fourth layer) are related through $h$, which is always
the identity on data. So, the state equivalence follows.

($\Leftarrow$) Let $q_1 \sim_h q_2$. By definition, $LG_1$ and $LG_2$ only differ by the
identity of some vertices (their number, arcs and labelling being identical).
By definition of the state equivalence, $h$ is the identity on data and relates
pids between $q_1$ and $q_2$. So $h$ relates in the same way the identities of vertices
in $V_1$ and $V_2$.


[Figure 3 (diagram not reproduced): layer 1 holds the location vertices L, H and F; layer 2 the abstracted vectors (ε), (ε), (ε, addr, 2009) and (ε, 2009); layers 3 and 4 the ε-labelled pid vertices. In the version with explicit names, layer 3 holds the pids 2, 1, 1.1 and 1.1.1, and layer 4 their next-pids 2.1, 1.2, 1.1.2 and 1.1.1.1.]

Figure 3: LG of $q_3$ and below its version with explicit vertex names included
to improve readability.


Running example. In our server example, the siblinghood
relation is not needed to compare pids. Indeed, the only requirement
is that a parent thread waits for one of its children to
terminate. So it is not necessary to consider $\curvearrowright_1$ nor $\curvearrowright$ in $\Omega'_{pid}$.
After taking this into account, the identification of equivalent
states in the LTS of Figure 2 leads to a reduced state space.
First, the LGs of $q_3$ and $q'_3$ are clearly isomorphic (see Figure 3).
The same holds for all pairs $q_i$, $q'_i$ for $1 \le i \le 5$. Thus, only
one of the two execution branches would be present in a reduced
representation, which shows how symmetric executions can be
identified.


Note that considering also the siblinghood relation for $LG(q_3)$
and $LG(q'_3)$ would add extra arcs, e.g., from the node of pid 1 towards
that of pid 2. This would result in losing the isomorphism
of the two LGs. Consequently, in order to increase the reduction
rate, it is important to keep in $\Omega'_{pid}$ only those relations that are
actually used by the system model generating an LTS.


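[Figure 4 (diagram not reproduced): a single location vertex L with two abstracted vectors (ε), each pointing through an arc labelled 0 to an ε-labelled pid vertex, which is in turn linked by a $\triangleleft_1$ arc to an ε-labelled next-pid vertex.]

Figure 4: LG of $q_0$.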


Let us then consider the initial state $q_0$ whose LG is represented
in Figure 4. It is easy to check that it is isomorphic to the LGs of
$q_5$ and $q'_5$. Thus, the infinite behaviour that is present in each execution
branch can be reduced to a loop because we consistently
abstract away the information about newly generated pids.


5 Experimental results 


As the exact complexity of checking graph isomorphism is in general unknown,
it is essential to evaluate how efficiently it can be performed in practice in the
case of checking the state equivalence defined in the previous section. We
therefore devised a systematic and thorough series of experiments aimed at
estimating the complexity of checking the isomorphism of graphs generated
by two states. We started by observing the following:


Checking isomorphism will, in general, be very efficient when the 
states being compared differ significantly, e.g., when the number 
of non-empty locations or pids involved is different. 


The reason is that in such cases the non-equivalence can be detected, e.g., 
in linear time w.r.t. the number of locations. In particular, the experiments 
should not consider pairs of independently generated states, as they are likely 
to differ by a wide margin. Consequently, the adopted testing methodology 
consisted in generating a random state q and then comparing it with a num- 
ber of similar (both equivalent or non-equivalent) states obtained through a 
number of transformations modifying the original state q. 
The above methodology stems from our overall aim not to provide a
verification tool but instead a method to be incorporated into such a tool. 
Using a set of verification case studies would not be appropriate for our 
purpose because this could introduce an undesirable bias. Indeed, any case 
study would necessarily yield states of a particular shape that could influence 
the efficiency of isomorphism checking. Furthermore, one would expect that 
any set of case studies contains balanced subsets of favourable, neutral and 
unfavourable cases. Considering randomly generated states can therefore be 
viewed as considering an arbitrarily diversified set of case studies. Moreover, 
as explained above, we have considered only similar states and thus only 
the less favourable cases from the point of view of our approach. As a 
consequence, our results can be seen as concerning the worst cases of any 
specific case study. Finally, running effective case studies would imply much 
more than checking isomorphisms, which would also introduce bias since 
the overall performance would then be influenced by many more (difficult 
to control) factors. 

To illustrate transformations used in our experiments, let us consider
the following example with two locations, each location comprising two vectors:

L : (2:2, 2) (3, 2.1:0)
L' : (1:0, 3, 4:0) (2.2:0)


where each pid is followed by a colon and the number of threads it has
already created. The transformations we applied were as follows:


• Transform component vectors so that the resulting state is equivalent:

L : (2.3.2.5:5, 2) (3, 2.3.2.5.4:3)
L' : (2.3.2.4:3, 3, 2.3.2.7:3) (2.3.2.5.5:3)


• Exchange vectors between locations:

L : (3, 2.1:0) (1:0, 3, 4:0)
L' : (2:2, 2) (2.2:0)


• Exchange components within vectors:

L : (2, 2:2) (2.1:0, 3)
L' : (4:0, 3, 1:0) (2.2:0)


• Exchange data and/or pid components between vectors:

L : (2:2, 3) (2, 2.1:0)
L' : (1:0, 3, 4:0) (2.2:0)

L : (2.1:0, 2) (3, 2:2)
L' : (4:0, 3, 2.2:0) (1:0)

L : (2.1:0, 3) (2, 2:2)
L' : (4:0, 3, 2.2:0) (1:0)


• Replace pids:

L : (26.25.25:0, 2) (3, 32.32.33:1)
L' : (35.35.34:0, 3, 32.32.33:1) (32.32.33.1:0)


• Increment the count of created pids:

L : (2:4, 2) (3, 2.1:0)
L' : (1:0, 3, 4:4) (2.2:3)

All these transformations were randomised with suitable parameters in order 
to control, e.g., the number of exchanged pids. 
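
Two of these transformations are simple enough to sketch in a few lines. The code below is an illustration under assumed encodings (a state as location-to-vector-list dictionary plus a counter dictionary, pids as integer tuples) and with hypothetical function names and parameters; it is not the generator used for the reported experiments.

```python
import random


def exchange_vectors(sigma, rng):
    """Swap one randomly chosen vector between two non-empty locations."""
    result = {loc: list(vectors) for loc, vectors in sigma.items()}
    candidates = [loc for loc, vectors in result.items() if vectors]
    if len(candidates) < 2:
        return result                     # nothing to exchange
    a, b = rng.sample(candidates, 2)
    i = rng.randrange(len(result[a]))
    j = rng.randrange(len(result[b]))
    result[a][i], result[b][j] = result[b][j], result[a][i]
    return result


def increment_counts(eta, rng, max_step=3):
    """Increase the creation counter of each pid by a random amount."""
    return {pid: count + rng.randint(0, max_step) for pid, count in eta.items()}


# The example state from the text: L : (2:2, 2) (3, 2.1:0), L' : (1:0, 3, 4:0) (2.2:0)
sigma = {"L": [((2,), 2), (3, (2, 1))],
         "L'": [((1,), 3, (4,)), ((2, 2),)]}
eta = {(2,): 2, (2, 1): 0, (1,): 0, (4,): 0, (2, 2): 0}

rng = random.Random(0)                    # fixed seed for reproducibility
sigma_swapped = exchange_vectors(sigma, rng)
eta_incremented = increment_counts(eta, rng)
```

Passing an explicit `random.Random` instance keeps each transformation reproducible, which matters when a slow comparison has to be replayed.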


In order to check graph isomorphism, we used NetworkX [5] implementing
the VF2 [4] algorithm, as well as Sage [17] implementing the Nauty [14]
algorithm that is usually regarded as the most efficient one. We carried out
more than two million comparisons, on the basis of which we observed
that the computation time:


• deteriorates with the state size and with the percentage of pids in
vectors (w.r.t. data); and


• improves with the increase of distinct data values.


This should not be surprising. In particular, the presence of data leads to 
labelled nodes that can be quickly matched when comparing two graphs. We 
further observed that Nauty is more efficient than VF2, and so the analysis 
below is based on the results obtained for Nauty. 


The experimental results are depicted in Figure 5 which shows the computation
time $t$ with respect to the number $p$ of distinct pids in the system.
We can clearly identify there two distinct components (groupings of measurements):


• the lower part which looks linear, or slightly sub-linear; and


• the upper part which visually fits in between the $O(p^2)$ and $O(p \log p)$
curves.

Each component was analysed separately using the same method. 

We first extracted the components as vectors of indexed pairs $(p_i, t_i)$
with $t_i$ being the observed computation time for a given number $p_i$ of pids
(see also Figure 5):


$C_1 \stackrel{\mathrm{df}}{=} [(p_i, t_i) \mid (1 \le i \le k_1) \wedge (p_i \ge 200) \wedge (t_i \le \ldots)]$

$C_2 \stackrel{\mathrm{df}}{=} [(p_i, t_i) \mid (1 \le i \le k_2) \wedge (p_i \ge 200) \wedge (t_i > \ldots)]$

where $t_i = \ldots$ corresponds to the straight dotted line in Figure 5, and
$p_i = 200$ is where the two components are clearly separated. We also ensured
that both $C_1$ and $C_2$ are ordered so that $p_i \le p_j$ whenever $i < j$. As it turned
out, $C_1$ comprises 78% of the observations for $p \ge 200$, while only 22% are
in $C_2$.

Let us assume that within the components $C_1$ and $C_2$, $t$ can respectively
be characterised by $O(p)$ and $O(p \log p)$. To verify this assumption, we
computed two more vectors:


$D_1 \stackrel{\mathrm{df}}{=} \left[ \frac{t_i}{p_i} \;\Big|\; (p_i, t_i) \in C_1 \right]$ and $D_2 \stackrel{\mathrm{df}}{=} \left[ \frac{t_i}{p_i \log p_i} \;\Big|\; (p_i, t_i) \in C_2 \right]$


By drawing a histogram of each $D_i$ we checked that its values were grouped
around 0.0045 for $D_1$, and 0.0075 for $D_2$. To gain more confidence about this
distribution, we sub-divided each $D_i$ into successive segments (corresponding
to growing $p_i$s) and drew their histograms. Then, taking $D_1$ as an example,
we have that:

• if $t = O(p)$ then the histograms should be similar for each segment;


• if $t > O(p)$ then the histogram of each segment should be right-shifted
w.r.t. that of the previous segment; and


• if $t < O(p)$ then the histogram of each segment should
be left-shifted w.r.t. that of the previous segment.
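
The segment-by-segment reasoning can be mimicked by comparing a summary statistic of each segment instead of a full histogram. The sketch below uses the median of the scaled times per segment; the function names, the tolerance and the synthetic data are assumptions made for illustration and are not the measurements reported in this paper.

```python
import math
from statistics import median


def segment_medians(observations, scale, segments=4):
    """Median of t / scale(p) within each of the successive segments.

    observations is a list of (p, t) pairs; segments follow growing p.
    """
    ordered = sorted(observations)
    ratios = [t / scale(p) for p, t in ordered]
    bounds = [len(ratios) * k // segments for k in range(segments + 1)]
    return [median(ratios[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


def trend(medians, tolerance=0.02):
    """Classify the drift between the first and the last segment."""
    first, last = medians[0], medians[-1]
    if last > first * (1 + tolerance):
        return "right-shift"   # t grows faster than the candidate bound
    if last < first * (1 - tolerance):
        return "left-shift"    # t grows more slowly than the candidate bound
    return "stable"            # t matches the candidate bound


# Synthetic timings (not experimental data): one linear, one p log p series.
linear = [(p, 0.005 * p) for p in range(200, 1000)]
nlogn = [(p, 0.005 * p * math.log(p)) for p in range(200, 1000)]

linear_vs_p = trend(segment_medians(linear, lambda p: p))
nlogn_vs_p = trend(segment_medians(nlogn, lambda p: p))
nlogn_vs_p2 = trend(segment_medians(nlogn, lambda p: p * p))
```

A median is a coarser summary than a histogram, so this only reproduces the direction of the shift, not the shape of the distributions.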


Figure 6 shows the histograms for $D_1$ and $D_2$ with 4 segments. We can now
make the following observations (which turned out to be the same for other 
numbers of segments that we tried): 
Figure 5: Top: observed computation times $t$ w.r.t. the number $p$ of unique
pids in compared states (darker points correspond to more frequent observations).
The straight line is $O(p)$, the upper curve is $O(p \log p)$ and the
lower curve is $O(p^2)$. Bottom: two components, $C_1$ (left) and $C_2$ (right).
• the left column confirms that $t = O(p)$ within $C_1$;


• in the middle column, there is a progressive right-shift of the histograms
when we go down the column, and so $t > O(p \log p)$ within
$C_2$; and


• the right column, showing the histograms for $D'_2 \stackrel{\mathrm{df}}{=} [t_i / p_i^2 \mid (p_i, t_i) \in C_2]$,
confirms that $t < O(p^2)$ within $C_2$.


This provides a full justification of the earlier visual observation made for
Figure 5. Moreover, since in the right column the shifting is faster than in
the middle one, it follows that for $D_2$, $t$ is closer to $O(p \log p)$ than to $O(p^2)$.


As a result, we are in a position to conclude that the cost of checking 
state equivalence is very low for states that differ considerably (in practice, 
a majority of the compared pairs), and is still very good for states that are 
equivalent or similar: for 78% of them, it is linear with respect to the number 
p of unique pids in the state, and for the remaining 22%, the computation 
time is slightly higher than O(p log p). 

Moreover, it should be noted that discovering equivalent states allows 
one to limit the state space exploration. The resulting reduction may also allow
one to analyse systems with infinite state spaces, or state spaces which are
too large to fit into the computer’s memory, and thus would have been intractable
regardless of the computation time.


6 Petri net implementation 


In this section, we will introduce a class of high-level Petri nets* for which 
the approach described above always works. More precisely, the syntactic 
restrictions imposed on these nets will imply Assumptions 1-3, and so for 
each net in this class the generated labelled transition system will satisfy all 
the requirements formulated in the general setting. 

The kind of (finite) high-level Petri net we have in mind is a tuple
$N \stackrel{\mathrm{df}}{=} (S, T, \lambda, M_0)$ which consists of a set $S$ of places, a set $T$ of transitions
(disjoint from $S$), a labelling $\lambda$ of places, transitions and arcs (in
$(S \times T) \cup (T \times S)$), and an initial marking $M_0$ (which is a mapping that associates to
each $s \in S$ a finite multiset of values in $\lambda(s)$) such that:


⁴We assume that the reader is familiar with the basic notions concerning high-level
Petri nets [13].
Figure 6: Left column: the $j$-th row from the top ($0 \leq j \leq 3$) shows the
distribution of $t_i/p_i$ for $(p_i, t_i) \in D_1$ such that $j k_1 \leq i < (j+1) k_1$
(i.e., the $j$-th segment of $D_1$). Middle and right columns: distributions of
$t_i/(p_i \log p_i)$ and $t_i/p_i^2$, respectively, for $(p_i, t_i) \in D_2$.

1. For each place $s \in S$, the type of $s$, $\lambda(s)$, is a Cartesian product
$X_1 \times \cdots \times X_k$ ($k \geq 1$), where each $X_i$ is $\mathbb{P}$ or $\mathbb{D}$.


2. The set of places of $N$ contains a unique generator place $s_\eta$ having
the type $\mathbb{P} \times \mathbb{N}$. It is used by the underlying scheme to implement the
mapping $\eta$ that is needed when new threads are spawned. For each
active thread identified by $\pi$, the generator place stores a token $(\pi, i)$
where $i$ is the integer counter of the child threads already spawned
by $\pi$. Thus the next threads to be created by $\pi$ will have the pids
$\pi.(i+1)$, $\pi.(i+2)$, etc.

3. We assume that the initial marking $M_0$ of $N$ is such that the generator
place contains exactly one token, and all other places are empty. This
implies that I is reduced to a single element.⁵


4. For each transition $t \in T$, the annotation on the arc from the generator
place to $t$ is a set of the form

$$\lambda(s_\eta, t) = \{(p_1, c_1), \ldots, (p_k, c_k)\}$$

where $k \geq 0$ and all the $p_i$'s and $c_i$'s are distinct pid and counter
variables. The annotation on the arc from $t$ to the generator place is
a set of the form:

$$\lambda(t, s_\eta) \stackrel{\mathrm{df}}{=} \left\{ \begin{array}{l}
(p_1, c_1 + n_1), \ldots, (p_m, c_m + n_m), \\
(p_1.(c_1 + 1), 0), \ldots, (p_1.(c_1 + n_1), 0), \\
\quad\vdots \\
(p_k.(c_k + 1), 0), \ldots, (p_k.(c_k + n_k), 0)
\end{array} \right\}$$

where $m \leq k$, and $n_j \geq 0$ for all $j$.⁶ An empty arc annotation means
that the arc is absent.

Below we denote by $\Pi_t$ the set of all the pid expressions $p_i.(c_i + j)$
used in $\lambda(t, s_\eta)$.


5. For each transition $t \in T$ and each place $s \in S \setminus \{s_\eta\}$, the annotation
on the arc from $s$ to $t$ is a multiset of vectors built from variables and
data values, and the annotation on the arc from $t$ to $s$ is a multiset of
vectors built from expressions involving data variables and data values
as well as elements from $\Pi_t \cup \{p_1, \ldots, p_m\}$.


6. For each transition $t \in T$, $\lambda(t)$ is a computable Boolean expression,
called the guard of $t$, built from the variables occurring in the
annotations of arcs adjacent to $t$ and data values. The usage of pids is
restricted to comparisons of the elements from $\Pi_t \cup \{p_1, \ldots, p_k\}$ using
the operators from $\Omega_{pid}$.


⁵This is not a restriction in practice since any desired marking can be obtained by
initially firing a special transition which inserts the required tokens into the places of the
net $N$.

⁶Note that the first row $(p_1, c_1 + n_1), \ldots, (p_m, c_m + n_m)$ corresponds to those pids
$p_1, \ldots, p_k$ which remain active after the firing of $t$, and the remaining rows correspond to
newly created pids. If $n_j = 0$ then the whole row $(p_j.(c_j + 1), 0), \ldots, (p_j.(c_j + n_j), 0)$ is
absent, i.e., no child pids were created for $p_j$.
Notice that the transitions of such nets manipulate the pids in a
controlled way. In particular, they may create new pids only from existing and
active ones, and only following a precise creation scheme (ensured by the
connection to the place $s_\eta$).
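
As an illustration of the creation scheme (a worked instance chosen here for the example, not taken from the paper), consider a transition $t$ that consumes a single generator token, keeps the corresponding thread active and spawns two children, i.e., $k = m = 1$ and $n_1 = 2$ in item 4:

```latex
\begin{align*}
\lambda(s_\eta, t) &= \{(p_1, c_1)\} \\
\lambda(t, s_\eta) &= \{(p_1, c_1 + 2),\ (p_1.(c_1 + 1), 0),\ (p_1.(c_1 + 2), 0)\}
\end{align*}
```

Assuming, for illustration only, that $1$ is an active pid with counter $0$, the binding $p_1 \mapsto 1$, $c_1 \mapsto 0$ replaces the token $(1, 0)$ in the generator place by the tokens $(1, 2)$, $(1.1, 0)$ and $(1.2, 0)$. The next child of thread $1$ would then receive the pid $1.3$, and the first child of $1.1$ would receive $1.1.1$.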

A binding $\beta$ of a transition $t \in T$ is a mapping from the variables
occurring in the guard of $t$ and the annotations of arcs adjacent to $t$ to
concrete values of the corresponding types. We use $\beta(e)$ to denote the result
of evaluating an expression $e$ under the binding $\beta$. Then, $t$ is enabled at a
marking $M$ if there is a binding $\beta$ such that the following hold:


• for all $s \in S$, $\beta(\lambda(s,t)) \leq M(s)$, i.e., there are enough tokens in the
input places of $t$;


• $\beta(\lambda(t,s))$ is a multiset over $\lambda(s)$, i.e., the types of the output places
of $t$ are respected; and


• $\beta(\lambda(t))$ evaluates to true.


Such an enabled transition $t$ may fire producing a new marking $M'$
such that, for all $s \in S$, we have:

$$M'(s) = M(s) - \beta(\lambda(s,t)) + \beta(\lambda(t,s)) .$$

We denote this by $M[t,\beta\rangle M'$. Then a marking $M$ is reachable if it can be
derived from $M_0$ by firing a finite sequence of transitions.
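
The enabling and firing rule can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions, not the implementation used for the experiments: markings are dictionaries from place names to `collections.Counter` multisets, the arc annotations are assumed to have been already evaluated under the chosen binding, and the names `is_enabled`, `fire` and `type_ok` are hypothetical.

```python
from collections import Counter


def is_enabled(marking, consumed, produced, guard_value, type_ok):
    """Check the three enabling conditions for one transition and one binding.

    marking     -- dict: place -> Counter (multiset of tokens)
    consumed    -- dict: place -> Counter, evaluated input arc annotations
    produced    -- dict: place -> Counter, evaluated output arc annotations
    guard_value -- bool, the guard evaluated under the binding
    type_ok     -- function (place, token) -> bool, membership in the place type
    """
    # Enough tokens in every input place.
    enough = all(marking.get(s, Counter())[tok] >= n
                 for s, toks in consumed.items()
                 for tok, n in toks.items())
    # Every produced token respects the type of its output place.
    typed = all(type_ok(s, tok)
                for s, toks in produced.items()
                for tok in toks)
    return enough and typed and guard_value


def fire(marking, consumed, produced):
    """Return the new marking: old tokens minus consumed plus produced."""
    places = set(marking) | set(consumed) | set(produced)
    empty = Counter()
    return {s: marking.get(s, empty) - consumed.get(s, empty)
               + produced.get(s, empty)
            for s in places}
```

Note that `Counter` subtraction silently drops negative counts, so `fire` should only be called after `is_enabled` has returned true for the same arguments.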

In [11] we also introduced a class of high-level Petri nets aimed at
modelling systems with dynamic process creation. However, the syntactic
rules there were over-complicated and so it was one of our objectives to
simplify the constraints imposed on suitable Petri nets. The resulting class
supports a much more refined way of modelling systems.


We will now present results validating our claim that the class of
high-level Petri nets just defined satisfies Assumptions 1-3. To start with, in
this particular case, states correspond to reachable markings of $N$. For each
such marking $M$, the corresponding state $q_M = (\sigma_M, \eta_M)$ is given by:

$$\sigma_M \stackrel{\mathrm{df}}{=} \{(s, M(s)) \mid s \in S\}
\qquad
\eta_M \stackrel{\mathrm{df}}{=} \{\pi \mapsto k \mid (\pi, k) \in M(s_\eta)\},$$

and the generated ct-configuration is $\mathit{ctc}_M = (G, H)$, where $H = \eta_M$ and
$G$ is the set of all pids occurring in the marking $M$. Note that $\eta_M$ is a
well defined function since no pid occurs more than once in $s_\eta$, which can be
shown in a similar way as Theorem 2 (see the Appendix). As a result, $q_M$
and $\mathit{ctc}_M$ are also well defined.
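
The construction of the state and of the ct-configuration from a marking can be sketched as follows. This is a minimal sketch under assumptions of our own: markings are dictionaries from place names to `collections.Counter` multisets, every token is a tuple of fields, a pid is encoded as a tuple of integers (so `(1, 2)` stands for the pid 1.2) and every other field is treated as data. The names `GEN` and `state_of` are hypothetical.

```python
from collections import Counter

GEN = "s_eta"  # hypothetical name of the generator place


def state_of(marking):
    """Extract (sigma, eta, G) from a marking.

    sigma -- the marking seen as a mapping place -> multiset of tokens
    eta   -- dict pid -> counter, read off the generator place
    G     -- set of all pids occurring anywhere in the marking
    """
    eta = {}
    for (pid, counter), mult in marking[GEN].items():
        # eta is a function only if each pid occurs exactly once in GEN.
        if mult != 1 or pid in eta:
            raise ValueError(f"pid {pid} occurs more than once in {GEN}")
        eta[pid] = counter
    # By the encoding assumed here, a field is a pid iff it is a tuple.
    G = {field
         for toks in marking.values()
         for tok in toks
         for field in tok
         if isinstance(field, tuple)}
    sigma = {s: Counter(toks) for s, toks in marking.items()}
    return sigma, eta, G
```

The pair `(G, eta)` then plays the role of the generated ct-configuration, and the `ValueError` corresponds to the situation excluded by the well-definedness remark above.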

We may now observe that Assumption 1 is guaranteed by the restric- 
tions on the form of annotations on the arcs adjacent to a single transition 
(in particular items 4 and 5 in the definition above). As to the remaining 
two general assumptions, we have the following. 


Theorem 2 (Petri net rendering of Assumption 2) Let $M$ be a reachable
marking of $N$. Then, $\mathit{ctc}_M$ is a ct-configuration.


Let $M$ and $M'$ be reachable markings such that $q_M \sim_h q_{M'}$. Then the
two markings are also h-equivalent, $M \sim_h M'$.


Theorem 3 (Petri net rendering of Assumption 3) Let $M$ and $M'$ be
h-equivalent reachable markings of $N$, and $t$ be a transition such that
$M[t,\beta\rangle\widetilde{M}$. Then $M'[t, h \circ \beta\rangle\widetilde{M}'$, where $\widetilde{M}'$ is a marking such that
$\widetilde{M} \sim_{\widetilde{h}} \widetilde{M}'$ for a bijection $\widetilde{h}$ coinciding with $h$ on the intersection of
their domains. Moreover, the result still holds if $\Omega_{pid}$ is restricted to any
of its subsets which includes pid equality.


Both theorems are proved in the Appendix (note that neither of these 
proofs was previously published). 


7 Comparison with other approaches 


A variety of methods such as those in [7, 2, 9] (see, e.g., [16] for a recent 
survey) address the state explosion problem by exploiting, in particular, 
system symmetries in order to avoid searching parts of the state space which 
are equivalent to those that have already been explored. Some methods, such 
as [1, 6, 15], have been implemented in widely used verification tools and 
proved to be successful in the analysis of, e.g., communication protocols and 
distributed systems. The method proposed in this paper actually focuses 
on abstracting thread identifiers, while symmetries are addressed indirectly. 
So, in addition to reducing symmetric executions, it can also cope with 
infinite repetitive behaviours. Moreover, it should be noted that symmetry 
reduction techniques rely on the computation of a canonical representation 
of each state. It has been shown in [3] that this is as hard as checking graph 
isomorphism. So, our method, while capturing symmetry reductions, is not 
computationally more complex. 
It is worth noting that our approach could in principle be combined with
any framework that offers symmetry reductions. To this end, one would
basically need to encode explicitly into the states of the model the information
captured by the layered graphs we generate for checking state equivalence.
In particular, this would require encoding and maintaining the relation between
active pids and the pids of their next siblings, as well as the comparisons
considered in $\Omega_{pid}$. This would greatly complicate the modelling but would
ensure that symmetry discovery is consistent with the constraints we require
in Assumptions 1-3.

When applied to Petri net markings, our approach can be more specif- 
ically related to that developed in [9] (which, incidentally, covers those 
in [7, 2]) where a general framework was proposed aimed at reduced state 
space exploration through the utilisation of symmetries in data types. More 
precisely, [9] defines three classes of primitive data types. Two of these, or- 
dered (for which symmetries are not considered) and cyclic (which are finite) 
cannot lead to reductions based on pids. The remaining one, unordered, can 
in principle be used for reduction even for infinite domains. An interesting 
aspect of the approach developed in [9] is that, in principle, it allows a defi- 
nition of data types taking into account various comparisons between pids. 
This could alleviate the encoding of the layered graphs within the model as
suggested above. Only the encoding and maintaining of the relation between 
active pids and their next siblings would then be necessary. 

Finally, any reduction method based on identifying equivalent Petri net 
markings falls into the category of equivalence reduction as defined in [8]. 
This method requires an equivalence specification to be consistent [8, def. 1], 
i.e., preserved along the executions of a system, which in our case has been 
split over Assumptions 1-3. However, it is important to note that [8] defines 
a general notion but does not provide any help in finding suitable equivalence 
relations and proving their consistency with respect to state space reduction. 
On the contrary, it is even stressed in [8] that this is a particularly difficult 
task, which, still according to [8], explains why equivalence reductions are 
not often used and why less general symmetry reductions are preferred. 


8 Concluding remarks 


In this paper we presented an abstraction technique for generating reduced 
state space representations of dynamic multi-threaded computing systems, 
aimed specifically at alleviating problems resulting from dynamic process 
creation. To make the new technique applicable to a wide range of different
system models, we based it at the behavioural level on general labelled
transition systems which have, in particular, to satisfy Assumptions 1-3. Such
an approach is practical as labelled transition systems of this kind can
be obtained by introducing mild syntactic restrictions on the system
model generating them, as shown in Section 6. That is, the introduced class
of high-level Petri nets does generate labelled transition systems satisfying
the conditions required in the first part of this paper. What is more, similar
restrictions can be introduced for other system models which support explicit
thread creation and manipulation.

The proposed technique is based on an equivalence relation between 
system states. It takes into account new process identifiers which can be 
derived from those present in the states being compared, in effect performing 
a limited lookahead. We demonstrated that checking state equivalence can 
be implemented in a very efficient way by reusing well known algorithms for 
graph isomorphism and their publicly available implementations. 
A Additional results and proofs of Theorems 2 and 3


In what follows, the function H occurring in a ct-configuration ctc = (G, H) 
is often treated as a set of pairs. We first observe that being a ct-configuration 
is unaffected by pid deletion. 


Proposition 1 Let $(G,H)$ be a ct-configuration. If $\mathit{dom}(H) \subseteq G' \subseteq G$ and
$H' \subseteq H$, then both $(G', H)$ and $(G, H')$ are ct-configurations.


Proof: Follows directly from the definition of a ct-configuration. 


The next result captures the change of a ct-configuration $\mathit{ctc} = (G, H)$
after the spawning of a single new thread. The idea is that we first select
$\pi$ in the domain of $H$ and then replace the unique $(\pi, i)$ in $H$ by $(\pi, i+1)$,
and respectively add $\pi.(i+1)$ and $(\pi.(i+1), 0)$ to $G$ and $H$, leading to:

$$\mathit{ctc}^{\pi} \stackrel{\mathrm{df}}{=} \big(G \cup \{\pi.(i+1)\},\ H \setminus \{(\pi, i)\} \cup \{(\pi, i+1), (\pi.(i+1), 0)\}\big) .$$
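
The single-thread spawning step on a ct-configuration can be sketched in Python as follows. This is an illustrative sketch only, assuming that pids are encoded as tuples of integers, that `G` is a set of pids and that `H` is a dictionary from pids to counters; the function name `spawn` is hypothetical.

```python
def spawn(G, H, pid):
    """Return the ct-configuration obtained by spawning one child of `pid`.

    G   -- set of pids (tuples of ints, e.g. (1, 2) for the pid 1.2)
    H   -- dict pid -> counter, treated as the set of pairs (pid, counter)
    pid -- a pid in the domain of H
    The inputs are left unchanged; a new pair (G, H) is returned.
    """
    i = H[pid]                  # the unique pair (pid, i) in H
    child = pid + (i + 1,)      # the new pid, i.e. pid.(i + 1)
    if child in G:
        # Cannot happen when (G, H) is a ct-configuration.
        raise ValueError(f"pid {child} is already present in G")
    new_G = set(G) | {child}
    new_H = dict(H)
    new_H[pid] = i + 1          # (pid, i) is replaced by (pid, i + 1)
    new_H[child] = 0            # (pid.(i + 1), 0) is added
    return new_G, new_H
```

For instance, starting from `G = {(1,)}` and `H = {(1,): 0}`, two successive calls `spawn(G, H, (1,))` create the pids `(1, 1)` and then `(1, 2)`, leaving the counter of `(1,)` at 2.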


In the proofs below we use CTC1 and CTC2 to respectively refer to the 
first and second part of the definition of a ct-configuration. 


Proposition 2 The $\mathit{ctc}^{\pi}$ above is a ct-configuration. Moreover, $\pi.(i+1) \notin G$.


Proof: Let $\mathit{ctc}^{\pi} = (G', H')$. We first observe that, by CTC2 for $\mathit{ctc}$,

$$\pi.(i+1).\tau \notin G, \text{ for all } \tau. \qquad (*)$$

Hence the second part of the result holds. Moreover, by $(*)$ and CTC1 for
$\mathit{ctc}$ together with $\mathit{dom}(H') = \mathit{dom}(H) \cup \{\pi.(i+1)\}$, we obtain that $H'$ is a
function.

It is clear that $\mathit{ctc}^{\pi}$ satisfies CTC1 since $\mathit{ctc}$ satisfies CTC1 and we have
$G' = G \cup \{\pi.(i+1)\}$ and $\mathit{dom}(H') = \mathit{dom}(H) \cup \{\pi.(i+1)\}$.
To show that CTC2 is also satisfied, we proceed as follows. Since $\mathit{ctc}$
satisfies CTC2, if the same is not true of $\mathit{ctc}^{\pi}$, then at least one of the
following three cases must hold.


Case 1: $\pi.(i+1) = \pi'.k.\tau$ and $k > m$, for some $(\pi', m) \in H \setminus \{(\pi, i)\}$. If $\tau$ is
empty, then $\pi = \pi'$, producing a contradiction with the choice of $(\pi', m)$ and
$H$ being a function. Hence $\tau = \tau'.n$, and so we have that $\pi = \pi'.k.\tau' \in G$,
producing a contradiction with CTC2 for $\mathit{ctc}$.


Case 2: $k > i+1$, for some $\pi.k.\tau \in G$. Then also $k > i$, producing a
contradiction with CTC2 for $\mathit{ctc}$.


Case 3: $\pi.(i+1).\tau \in G$, for some $\tau$. Then we immediately obtain a
contradiction with $(*)$.


We can now prove that Assumption 2 holds for the class of Petri nets
introduced in this paper.


Sketch of the proof of Theorem 2: Follows from the fact that $\mathit{ctc}_{M_0}$ is
a ct-configuration and from the definition of annotations on arcs adjacent to
a transition $t$. More precisely, they imply that if $M[t,\beta\rangle M'$ then $\mathit{ctc}_{M'}$ can
be derived from $\mathit{ctc}_M$ by zero or more applications of Proposition 2 followed
by at most two applications of Proposition 1.


We now turn to the properties of equivalent markings. In what follows,
two ct-configurations such as in the definition of h-equivalent states will
also be called h-equivalent. Moreover, $\mathit{next}(H) = \{\pi.(i+1) \mid (\pi, i) \in H\}$,
for every ct-configuration $(G,H)$. We can then show that h-equivalence of
ct-configurations is preserved by coherent deletions of pids.


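Proposition 3 Let $(G,H)$ and $(G',H')$ be h-equivalent ct-configurations,
and $\mathit{dom}(H) \subseteq \widehat{G} \subseteq G$ and $\widehat{H} \subseteq H$. Moreover, let $\widehat{G}' = h(\widehat{G})$, and let $\widehat{H}'$ be
obtained from $H'$ by deleting each $(\pi, i)$ such that $\pi \notin h(\mathit{dom}(\widehat{H}))$.

• $(\widehat{G}, H)$ and $(\widehat{G}', H')$ are $\widehat{h}$-equivalent ct-configurations, where $\widehat{h}$ is $h$
restricted to $\widehat{G} \cup \mathit{next}(H)$.

• $(G, \widehat{H})$ and $(G', \widehat{H}')$ are $\widehat{h}$-equivalent ct-configurations, where $\widehat{h}$ is $h$
restricted to $G \cup \mathit{next}(\widehat{H})$.


Proof: By Proposition 1, $(\widehat{G}, H)$, $(\widehat{G}', H')$, $(G, \widehat{H})$ and $(G', \widehat{H}')$ are all
ct-configurations. Then the result holds as in each case the new bijection is
a restriction of $h$.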
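Proposition 4 Let $\mathit{ctc} = (G,H)$ and $\mathit{ctc}' = (G',H')$ be h-equivalent
ct-configurations, and let $(\pi, i) \in H$ and $(\pi', j) \in H'$ be such that $\pi' = h(\pi)$.
Then $(G,H)^{\pi}$ and $(G',H')^{\pi'}$ are $\widetilde{h}$-equivalent ct-configurations, for some
extension $\widetilde{h}$ of $h$.


Proof: Let $\widetilde{\mathit{ctc}} = (\widetilde{G}, \widetilde{H}) = (G,H)^{\pi}$ and $\widetilde{\mathit{ctc}}' = (\widetilde{G}', \widetilde{H}') = (G',H')^{\pi'}$. From
the assumptions we made and the fact that $\mathit{ctc}$ and $\mathit{ctc}'$ are h-equivalent,
it follows that $h(\pi.(i+1)) = \pi'.(j+1)$, $\widetilde{G} = G \uplus \{\pi.(i+1)\}$ and
$\widetilde{G}' = G' \uplus \{\pi'.(j+1)\}$ as well as:

$$\mathit{next}(\widetilde{H}) = \mathit{next}(H) \setminus \{\pi.(i+1)\} \cup \{\pi.(i+2)\}$$
$$\mathit{next}(\widetilde{H}') = \mathit{next}(H') \setminus \{\pi'.(j+1)\} \cup \{\pi'.(j+2)\} .$$

We then extend $h$ to $\widetilde{h}$ by adding $\widetilde{h}(\pi.(i+2)) = \pi'.(j+2)$. Then $\widetilde{h}$ is a
bijection, which follows from $\pi.(i+2) \notin G$ and $\pi'.(j+2) \notin G'$, which in turn
follow from CTC2 for $\mathit{ctc}$ and $\mathit{ctc}'$, respectively. One can easily check that
$\widetilde{\mathit{ctc}}$ and $\widetilde{\mathit{ctc}}'$ are $\widetilde{h}$-equivalent ct-configurations.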


We can now prove that Assumption 3 holds for the class of Petri nets 
introduced in this paper. 


Sketch of the proof of Theorem 3: We first observe that $t$ with the
binding $h \circ \beta$ is enabled at the marking $M'$. In particular, the guard of $t$
evaluates to true (as in the case of $\beta$ and $M$) since the pids can only be
involved in comparisons using the operators in $\Omega_{pid}$ and the ct-configurations
$\mathit{ctc}_M$ and $\mathit{ctc}_{M'}$ are h-equivalent. We then observe that the existence of a
suitable bijection $\widetilde{h}$ satisfying $\widetilde{M} \sim_{\widetilde{h}} \widetilde{M}'$ comes from zero or more
applications of Proposition 4 followed by at most two applications of Proposition 3.